symmetry group
Breaking Data Symmetry is Needed For Generalization in Feature Learning Kernels
Bernal, Marcel Tomàs; Mallinar, Neil Rohit; Belkin, Mikhail
Grokking occurs when a model achieves high training accuracy long before it generalizes to unseen test points. This phenomenon was initially observed on a class of algebraic problems, such as learning modular arithmetic (Power et al., 2022). We study grokking on algebraic tasks in a class of feature learning kernels via the Recursive Feature Machine (RFM) algorithm (Radhakrishnan et al., 2024), which iteratively updates feature matrices through the Average Gradient Outer Product (AGOP) of an estimator to learn task-relevant features. Our main experimental finding is that generalization occurs only when a certain symmetry in the training set is broken. Furthermore, we empirically show that RFM generalizes by recovering the underlying invariance group action inherent in the data. We find that the learned feature matrices encode specific elements of the invariance group, which explains the dependence of generalization on symmetry.
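The RFM loop described in the abstract (fit a kernel predictor, compute its AGOP, reuse the AGOP as the feature matrix) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian kernel, the `bandwidth` and `ridge` values, the function names, and the toy regression task (a target that depends on one coordinate only) are all hypothetical choices made here. The modular-arithmetic tasks studied in the paper are not reproduced.

```python
import numpy as np

def gaussian_kernel(X, Z, M, bandwidth):
    # Pairwise Mahalanobis distances (x - z)^T M (x - z), with M symmetric.
    XM = X @ M
    d2 = (XM * X).sum(1)[:, None] - 2.0 * XM @ Z.T + ((Z @ M) * Z).sum(1)[None, :]
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * bandwidth**2))

def agop(X, alpha, M, bandwidth):
    # Average Gradient Outer Product of f(x) = sum_i alpha_i K_M(x, x_i),
    # averaged over the training points and summed over output dimensions.
    n, d = X.shape
    K = gaussian_kernel(X, X, M, bandwidth)
    XM = X @ M
    G = np.zeros((d, d))
    for c in range(alpha.shape[1]):
        W = K * alpha[:, c][None, :]  # W[j, i] = alpha_ic * K(x_j, x_i)
        # grad f_c(x_j) = -(1 / s^2) * sum_i W[j, i] * M (x_j - x_i)
        grads = -(W.sum(1)[:, None] * XM - W @ XM) / bandwidth**2
        G += grads.T @ grads
    return G / n

def rfm(X, Y, iters=3, bandwidth=2.0, ridge=1e-3):
    # Alternate kernel ridge regression with AGOP updates of the feature matrix.
    n, d = X.shape
    M = np.eye(d)
    for _ in range(iters):
        K = gaussian_kernel(X, X, M, bandwidth)
        alpha = np.linalg.solve(K + ridge * np.eye(n), Y)
        M = agop(X, alpha, M, bandwidth)
    return M

# Toy task (an assumption for illustration): the target depends on coordinate 0 only.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
Y = X[:, :1]
M = rfm(X, Y)
print(np.round(np.diag(M), 3))
```

On this toy task the diagonal of `M` is expected to concentrate on the first coordinate, which is the sense in which the feature matrix picks up task-relevant directions.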
e449b9317dad920c0dd5ad0a2a2d5e49-Paper.pdf
In the natural sciences, physics has found great success by describing the universe in terms of symmetry-preserving transformations. Inspired by this formalism, we propose a framework, built upon group representation theory, for learning representations of a dynamical environment structured around the transformations that generate its evolution. Experimentally, we learn the structure of explicitly symmetric environments without supervision, from observational data generated by sequential interactions.
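To make the notion of a group representation concrete, the sketch below uses a hypothetical toy environment that is not taken from the paper: an agent on a ring of `n` positions whose actions are cyclic shifts. Each action is mapped to a 2D rotation matrix, and the code checks that composing actions matches multiplying matrices, and that the latent state transforms consistently with the environment. The names `rho` and `encode` are assumptions introduced here, and nothing is learned in this example.

```python
import numpy as np

def rho(k, n):
    # Representation of the shift-by-k action as a rotation by 2*pi*k/n.
    theta = 2.0 * np.pi * k / n
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

def encode(position, n):
    # Latent state of a ring position: a point on the unit circle.
    return rho(position, n) @ np.array([1.0, 0.0])

n = 5

# Homomorphism property: rho(g) rho(h) == rho(g + h mod n).
is_homomorphism = all(
    np.allclose(rho(g, n) @ rho(h, n), rho((g + h) % n, n))
    for g in range(n) for h in range(n)
)

# Equivariance: acting in the environment, then encoding, equals
# encoding first and then applying the matrix of the action.
is_equivariant = all(
    np.allclose(encode((p + a) % n, n), rho(a, n) @ encode(p, n))
    for p in range(n) for a in range(n)
)

print(is_homomorphism, is_equivariant)
```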